Abstract. In a complex network, different groups of nodes may have existed for different amounts of time. Detecting the evolutionary history of a network is of great importance. We present a spectral-analysis based method to address this fundamental question in network science. In particular, we find that there are complex networks in the real world for which there is a positive correlation between the eigenvalue magnitude and node age. In situations where the network topology is unknown but short time series measured from nodes are available, we suggest uncovering the network topology at the present (or any given time of interest) by using compressive sensing and then performing the spectral analysis. Knowledge of the ages of various groups of nodes can provide significant insights into the evolutionary process underpinning the network. It should be noted, however, that at present the applicability of our method is limited to networks for which information about the node age has been encoded gradually in the eigen-properties through evolution.

1 Introduction

Many large, complex networks in existence today are the results of some evolutionary processes such as growth [1]. The Internet is one of the best examples, having undergone tremendous expansion in the past two decades. Growth in a decentralized manner also appears to be the hallmark of other types of networks such as various biological, social and economic networks (e.g., Facebook). Given a complex network but without any knowledge of its evolutionary history, one might be interested in the distribution of the "ages" of various nodes or subgroups of nodes in the network. Information about the node ages can shed light on the structure of the underlying network, and may have significant applications. For example, in a social network, the lifetimes of certain subgroups of nodes may be closely related to the network backbone structure in terms of the roles that these subgroups play in the function of the network, e.g., leadership roles. In a biological network, nodes of longer lifetimes can be more critical to the various functions of the network. It is thus of considerable interest to develop a systematic method to uncover the evolutionary ages of subgroups of nodes in complex networks.

Two situations arise when addressing the age-detection problem in complex networks: (1) the network topology is known and (2) the topology is unknown but time series measured or observed from various nodes are available. In the first case we shall establish that the spectrum of the network connectivity matrix, or the Laplacian matrix, is directly related to the evolutionary ages of various subgroups of nodes in the network. In the second case, we make use of a recently developed method of time-series based reverse engineering of complex networks [2] to uncover the network topology, and then analyze the spectrum of the predicted Laplacian matrix to obtain estimates of the age distribution of nodes. Our approach thus defines a framework in which the problem of evolutionary-age detection of nodes in complex networks can be addressed in a systematic way. While our method does not require a positive correlation between the node degree and age, a correlation between the eigenvalue and the node age is necessary.

It is useful to point out that for the class of scale-free networks that are generated according to the preferential-attachment rule [3], the problem of evolutionary-age estimation may be trivial. In particular, this growth rule stipulates that the probability for an existing node to acquire new links is proportional to its degree, implying a strong correlation between the node degree and its lifetime. Thus, for a scale-free network evolved predominantly according to the preferential-attachment rule, the ages of various nodes can be predicted simply by examining the degrees. However, many real-world networks deviate significantly from the scale-free topology [1] and, for them, the problem of detecting node evolutionary ages is nontrivial. Nonetheless, scale-free networks provide an ideal testbed to validate our spectrum-analysis method.

We emphasize that, although our method is suitable even for networks for which there is no positive correlation between node degree and age, its applicability is limited to networks for which there is a positive correlation between the properties of the eigenmodes and the node age. For networks with which no evolutionary process can be affiliated, such as various citation networks and Twitter-type social networks where the importance of a node may not be related to its age, our method is not applicable.

In Sec. 2 we describe the main idea underlying our method. In Sec. 3 we validate the method by using scale-free networks generated by the standard preferential-attachment rule and by the duplication/divergence mechanism, which are especially relevant to social and biological systems, respectively. In Sec. 4 we consider a realistic biological network, the protein-protein interaction network for which the age distribution of nodes is available, to further validate our method. In Sec. 5 we address the situation where the network topology is not known a priori but only time series are available, make use of the reverse-engineering approach [2] to map out the network topology, and demonstrate that the approach yields correctly and accurately the spectrum of the Laplacian matrix. A brief conclusion is presented in Sec. 6.

2 Method

For a complex network of N nodes, its topological structure can be described by the Laplacian matrix L [4,5,6,7,8], where the off-diagonal elements of L are $L_{ij} = L_{ji} = -1$ (0) if the nodes i and j are connected (disconnected). The diagonal elements are $L_{ii} = -\sum_{j=1, j \neq i}^{N} L_{ij} = k_i$, where $k_i$ is the number of nodes connected directly with node i (node degree). The eigenvalues of L are nonnegative and can be ranked as $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$. The corresponding eigenvectors are $X_1, X_2, \ldots, X_N$, whose wavelengths are sorted in descending order. Each eigenvector contains components concentrated on various nodes in the network.

For a regular or a small-world network [9], the eigenvectors typically exhibit some wave patterns with certain wavelengths [10,11]. When a perturbation is applied to the network, the affected eigenvectors are those whose wavelengths match the size of the perturbation (i.e., the number of nodes that it affects). In this case, some localized structure in the affected eigenvectors can emerge. Eigenvectors associated with small eigenvalues usually have large wavelengths, and so they are sensitive to perturbation on a global scale. In contrast, eigenvectors associated with large eigenvalues are most sensitive to localized perturbations that are applied to a small set of nodes in the network. The responses of the eigenvectors to perturbations thus reflect the structure of the network at different scales. An example is given in Fig. 1 for a one-dimensional regular lattice of N = 100 nodes with periodic boundary condition, where each node is connected with 2 neighbors on either side so that the node has 4 nearest neighbors. Shown in Fig. 1(a) are representative eigenvectors, where the values of $N \cdot X_i(s)$ are plotted and $X_i(s)$ is the sth component of the eigenvector $X_i$. We see that the eigenvectors represent periodic waves of wavelengths ranging from N to 2. To observe the effect of local structural perturbation on the eigenvectors, we add two more links to each node in the group of nodes whose indices are between 40 and 60 so that each node in this perturbed group now has six nearest neighbors. Let $\lambda'_i$ (i = 1, ..., N) be the eigenvalues in the perturbed network. Figure 1(b) shows some representative eigenvectors. We observe that the eigenvectors associated with small eigenvalues, e.g., $\lambda'_1$ and $\lambda'_{40}$, are basically unchanged. However, eigenvectors associated with relatively large eigenvalues, such as $\lambda'_{100}$, are strongly altered by the perturbation, but the changes are focused on the perturbed group of nodes. Figure 1(c) shows the distribution of the magnitudes of all eigenvectors on nodes in the network, where we see that those associated with the largest eigenvalues, up to $\lambda'_{100}$, are sensitive to the perturbation, with large variations appearing on the perturbed nodes.

Fig. 1. (Color online.) For a regular ring network of 100 nodes where each node has four neighbors, (a) examples of typical, periodic-wave like eigenvectors, (b) typical eigenvectors when each node in the group of indices between 40 and 60 acquires two additional links, one on each side. We observe significant distortions from the periodic-wave pattern, which are localized between the 40th and 60th components of eigenvectors associated with relatively large eigenvalues. (c) Representation of all eigenvectors, where those associated with the largest eigenvalues, up to $\lambda'_{100}$, are significantly more sensitive to the structural perturbation to the network.

For complex networks that do not possess a regular backbone, such as random [12] and scale-free [3] networks, the eigenvectors in general do not exhibit any periodic wave structure. Nonetheless, the observation that the eigenvectors associated with larger eigenvalues are more sensitive to structural perturbations can be used to infer the evolutionary age of nodes. To see this, consider a scale-free network evolved according to the preferential-attachment rule [3], for which there is a positive correlation between the node degree and lifetime. That is, nodes of "old" ages tend to have more links and they are thus more susceptible to perturbations applied randomly to the network during the evolutionary process. Since the eigenvectors of large eigenvalues are quite sensitive to perturbations (cf. Fig. 1), we expect the large-degree nodes to dominate these eigenvectors. As a result, large eigenvalues tend to correspond to nodes of long lifetime. This argument suggests that nodes having the most significant components of the eigenvectors associated with the largest eigenvalues are likely to possess the longest lifetime in the network.
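
The ring-lattice experiment of Fig. 1 can be reproduced numerically in a few lines. The following is only a minimal sketch under stated assumptions: the text does not say where the two extra links of each perturbed node go, so here they are assumed to connect to the third neighbor on each side, and the perturbed group is assumed to be the inclusive index range 40 to 60. The helper names `ring_adjacency` and `laplacian` are illustrative.

```python
import numpy as np

def ring_adjacency(n, k=2):
    """Adjacency matrix of a ring in which every node links to k neighbors on either side."""
    A = np.zeros((n, n))
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n          # periodic boundary condition
            A[i, j] = A[j, i] = 1.0
    return A

def laplacian(A):
    """Graph Laplacian: node degrees on the diagonal, -1 for every link."""
    return np.diag(A.sum(axis=1)) - A

N = 100
group = np.arange(40, 61)            # perturbed nodes (assumed inclusive range)

A0 = ring_adjacency(N)               # unperturbed lattice, degree 4 everywhere
A1 = A0.copy()
for i in group:                      # assumption: extra links go to the third neighbors
    for j in ((i - 3) % N, (i + 3) % N):
        A1[i, j] = A1[j, i] = 1.0

# eigh returns eigenvalues in ascending order, eigenvectors as columns
vals0, vecs0 = np.linalg.eigh(laplacian(A0))
vals1, vecs1 = np.linalg.eigh(laplacian(A1))

# Share of the squared amplitude of the top eigenvector carried by the perturbed group
weight_top = float(np.sum(vecs1[group, -1] ** 2))
print(f"largest eigenvalue: {vals0[-1]:.3f} -> {vals1[-1]:.3f}")
print(f"weight of top eigenvector on perturbed group: {weight_top:.3f}")
```

In this sketch the eigenvector of the largest eigenvalue of the perturbed lattice concentrates on the perturbed group, which is the localization effect the argument above relies on.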

3 Validation using scale-free networks 

To exemplify the relation between eigenvalues and node ages, 
we consider standard scale-free networks [3]. Each network has
N = 2000 nodes and is evolved following the preferential-
attachment rule, so that the age of the ith node is N - i + 1. For
a given eigenvalue, the lifetime of the associated eigenvector is
the average age of all nodes contained in the vector, weighted
by the respective components of the eigenvector. Figures 2(a-c)
show the ages of the eigenvectors X_i versus the index i for
three networks of different edge density w. The significant fea-
ture common to all three cases is that the average age of the
nodes dominating some eigenvector increases on average with
the eigenvalue. The average degree of each eigenvector, i.e., the
weighted average of the degrees of all nodes associated with
the vector, shows the same tendency, as shown in Figs. 2(d-f),
where the average degree is presented on a logarithmic scale.
For each network, the sizes of the eigenvectors are shown in
Figs. 2(g-i), where the size of an eigenvector is defined to be
the number of nodes on which the vector component is larger than a small threshold value. For a sufficiently dense network, e.g., Fig. 2(i), the size tends to decrease on average with the eigenvalue, indicating that a small group of nodes have extraordinarily long lifetimes in the network and their relative ages can be identified simply by examining the associated eigenvalues. Figures 2(j-l) show, for w = 2, 4 and 8, respectively, the average evolutionary age versus the node degree. We observe an approximately monotonic relation for small degree. However, when the node degree is larger than 10, the relation deteriorates quickly and approaches a constant.

To further demonstrate our method, we have analyzed a scale-free cellular network generated by a mechanism different from the preferential-attachment rule, namely the protein-protein interaction (PPI) networks. In such a network, duplication and divergence are believed to be responsible for the topological structure [13]. We start from a small, connected graph as a seed and duplicate a randomly selected existing protein at each step. The newcomer duplicates exactly the connection pattern of its generator in the network. Due to mutations, some of the duplicated edges are broken with probability p, while new edges are generated with probability q between the newcomer and other existing nodes. To compare with the PPI network of the baker's yeast (to be described in the next section), we generate networks with comparable parameters. In particular, a typical network has 2235 nodes and an average degree of 10.52, and its degree distribution follows a power law with exponent 2.3. In a wide range of eigenvalues there exists a strong correlation between the eigenvalue and average age, as shown in Fig. 3(a). We observe that the curve of average age versus degree exhibits large fluctuations, as shown in Fig. 3(d). It is thus not possible to obtain information about node age from degree. However, behaviors of the eigenmodes can reveal the age information, as will be demonstrated in Sec. 4.

4 Evolutionary ages of nodes in a protein-protein interaction network

To lend more credence to our proposition that the evolutionary ages of nodes can be inferred from the eigenvalues, we now consider a class of networks in systems biology, protein-protein interaction (PPI) networks. These networks are the result of a number of evolutionary mechanisms such as duplications of genes and reattachments of links between the proteins. Specifically, we analyze the PPI network of the baker's yeast (Saccharomyces cerevisiae) [14,15]. Von Mering et al. [16] analyzed a total of 80000 interactions among 5400 yeast proteins reported previously and assigned each interaction a confidence value. In order to reduce the effect of false positives, we focus on 11855 interactions with high and medium confidence values among 2617 yeast proteins. In a PPI network, each protein is a node and each pairwise interaction represents a link between two nodes. Since our goal is to assess, through the eigenvalues, the evolutionary ages of the nodes, we neglect the directions of the edges. The largest connected component of the PPI network contains 2235 nodes. In systems biology, the evolutionary processes of the proteins are classified into four iso-temporal groups [17]: prokaryotes, eukarya, fungi, and yeast, to which numbers 4, 3, 2 and 1 are assigned according to their evolutionary process from ancient to modern times, respectively. The evolutionary age of a protein is the largest number among the groups in which it is present. For example, the protein YHR037w occurs in the groups prokaryotes (4), eukarya (3), and fungi (2), which means that it can be found from the ancient prokaryotes, so that its age is 4. Figure 4(a) shows the average evolutionary age of nodes in an eigenvector versus the eigenvalue index, which is similar to the behavior in Figs. 2(a-c). This suggests that for a realistic biological network, there is indeed a positive correlation between the eigenvalues of the Laplacian matrix and the evolutionary ages of groups of nodes. Since PPIs typically possess a scale-free structure [18], we expect the average degree of groups of nodes to exhibit similar behaviors as in Figs. 2(d-f). This is indeed the case, as shown in Fig. 4(b). The sizes of various eigenvectors are shown in Fig. 4(c). Again the behavior is similar to those in Figs. 2(g-i). From Fig. 4(d), the relation of average age versus degree, we see that the degree contains no information about the node age.
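
The duplication/divergence growth used in Sec. 3 as a model of PPI networks can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not specify the seed graph or exactly how the probability q is applied, so the sketch assumes a fully connected seed of three nodes and that q applies independently to every existing node. The function name `duplication_divergence` and the values of `n`, `p` and `q` below are illustrative choices.

```python
import numpy as np

def duplication_divergence(n, p, q, rng, seed_size=3):
    """Grow an undirected network by node duplication followed by divergence.

    Returns the adjacency matrix and, for every node, the index of the node
    it was duplicated from (-1 for seed nodes). Node index = birth order.
    """
    A = np.zeros((n, n), dtype=int)
    A[:seed_size, :seed_size] = 1 - np.eye(seed_size, dtype=int)  # assumed seed
    parents = [-1] * seed_size
    for new in range(seed_size, n):
        parent = int(rng.integers(new))            # randomly selected generator
        # copy the generator's links, breaking each one with probability p
        inherited = A[parent, :new] * (rng.random(new) >= p)
        # add a link to each existing node with probability q (assumption)
        extra = (rng.random(new) < q).astype(int)
        row = np.maximum(inherited, extra)
        A[new, :new] = row
        A[:new, new] = row
        parents.append(parent)
    return A, parents

rng = np.random.default_rng(0)
A, parents = duplication_divergence(200, p=0.4, q=0.005, rng=rng)
print("nodes:", A.shape[0], "mean degree:", A.sum() / A.shape[0])
```

Because nodes are indexed by birth order, the age of node i is known by construction, so the output can be fed directly into an eigenvector-age analysis. Note that a newcomer may end up isolated if all of its inherited links are broken; the sketch does not remove such nodes.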



Fig. 2. (Color online.) For three scale-free networks gener-
ated according to the standard preferential-attachment rule with
edge density w = 2, 4, 8 (corresponding to the left, middle,
and right column, respectively), (a-c) average ages, (d-f) aver-
age degree (on a logarithmic scale), and (g-i) size of eigenvec-
tor versus the eigenvalue index i. Eigenvectors associated with
large eigenvalues generally have small sizes, but their ages are
"older" in the network. (j-l) Average age versus degree. We
see that, while small degree is related to the average age, in-
formation about node age deteriorates quickly as the degree is
increased.
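
The quantities plotted in Fig. 2 (age, degree and size of each eigenvector) can be computed along the following lines. This is a hedged sketch: the text says only that ages are "weighted by the respective components of the eigenvector", so squared components are assumed as weights here; the threshold value, the reading of the edge density w as the number of links `m` added per new node, and the small network size are likewise assumptions made for illustration.

```python
import numpy as np

def preferential_attachment(n, m, rng):
    """Grow a network in which each new node attaches m links to existing
    nodes with probability proportional to their degree."""
    A = np.zeros((n, n))
    A[: m + 1, : m + 1] = 1.0 - np.eye(m + 1)     # fully connected seed (assumption)
    deg = A.sum(axis=1)
    for new in range(m + 1, n):
        prob = deg[:new] / deg[:new].sum()
        targets = rng.choice(new, size=m, replace=False, p=prob)
        A[new, targets] = A[targets, new] = 1.0
        deg[targets] += 1
        deg[new] = m
    return A

def eigenvector_profiles(A, ages, threshold=1e-3):
    """Weighted age, weighted degree and size of every Laplacian eigenvector."""
    deg = A.sum(axis=1)
    vals, vecs = np.linalg.eigh(np.diag(deg) - A)  # ascending eigenvalues
    w = vecs ** 2                                  # assumed weights; each column sums to 1
    age = ages @ w                                 # weighted average age per eigenvector
    degree = deg @ w                               # weighted average degree per eigenvector
    size = (np.abs(vecs) > threshold).sum(axis=0)  # nodes above the threshold
    return vals, age, degree, size

rng = np.random.default_rng(0)
N, m = 300, 2
A = preferential_attachment(N, m, rng)
ages = (N - np.arange(N)).astype(float)            # first node is the oldest
vals, age, degree, size = eigenvector_profiles(A, ages)
print("mean age, 10 largest eigenvalues:", age[-10:].mean())
print("mean age, all eigenvectors:     ", age.mean())
```

With squared components as weights, the mean of the eigenvector ages over all eigenvectors equals the mean node age, so any excess age at the large-eigenvalue end must be compensated at the other end of the spectrum.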



5 Time-series based detection of 
evolutionary ages of nodes 

We now address the situation where the network topology is unknown but only time series measured or observed from various nodes are available. We shall apply a recently developed approach [2] based on compressive sensing [19,20,21,22,23,24] to uncover the complex-network topology and then analyze the spectrum of the predicted Laplacian matrix to estimate the evolutionary ages of nodes. The unique feature of compressive sensing lies in its extremely low data requirement: very little observation is needed to obtain a target sparse signal. In general, the problem of compressive sensing can be described as reconstructing a sparse vector $a \in R^N$ from linear measurements X about a in the form $X = G \cdot a$, where $X \in R^M$ and G is an $M \times N$ matrix. Accurate reconstruction can be achieved by solving the following convex optimization problem:

$$\min \|a\|_1 \quad \text{subject to} \quad G \cdot a = X, \qquad (1)$$

where $\|a\|_1 = \sum_{i=1}^{N} |a_i|$ is the $L_1$ norm of vector a and $M \ll N$, i.e., the number of measurements can be much less than the number of components of the unknown signal. Various solutions of the convex optimization problem (1) have been worked out in the applied-mathematics literature [19,20,21,22,23,24].

Fig. 3. (Color online.) For scale-free networks generated by the duplication/divergence-based mechanism from the PPI network of the baker's yeast, (a) average age versus eigenvalue index, (b) average degree versus eigenvalue index, and (c) size of eigenvector versus the eigenvalue index. Eigenvectors associated with large eigenvalues generally have small sizes, but their ages are "older" in the network. (d) Average age versus degree. Because of large fluctuations, the degree cannot give age-related information, but the eigenvalues can.

Fig. 4. (Color online.) For the largest connected component of the PPI network of the baker's yeast with 2235 nodes, (a) the evolutionary age, (b) average degree (on a logarithmic scale), and (c) size of eigenvector versus the eigenvalue index i. These results further indicate that the evolutionary ages of various nodes in the network can be inferred from the eigenvalue spectrum of the Laplacian matrix. (d) Average age versus degree. We see that degree does not reveal age-related information.

To uncover network topology based on data, it is necessary to cast the problem in the form (1). The basic hypothesis is that a complex networked system can be viewed as a large dynamical system that generates oscillatory time series at various nodes. Under this hypothesis, it is straightforward to formulate the problem under the compressive-sensing paradigm, details of which can be found in Ref. [2].
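
The convex problem (1) can be solved as a linear program by splitting the unknown vector into nonnegative parts, a = u - v with u, v >= 0, and minimizing the sum of all entries of u and v. The sketch below uses `scipy.optimize.linprog` for this; it is only one of several possible solvers and is not necessarily the one used in Ref. [2]. The random Gaussian matrix G, the problem sizes and the function name `l1_reconstruct` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(G, X):
    """Solve min ||a||_1 subject to G a = X via a linear program.

    Writing a = u - v with u, v >= 0, the objective becomes sum(u) + sum(v)
    and the constraint becomes [G, -G] [u; v] = X.
    """
    M, N = G.shape
    cost = np.ones(2 * N)
    res = linprog(cost, A_eq=np.hstack([G, -G]), b_eq=X,
                  bounds=(0, None), method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return res.x[:N] - res.x[N:]

# Toy example: N unknowns, M < N measurements, k nonzero entries (all assumed values)
rng = np.random.default_rng(1)
N, M, k = 60, 30, 4
a_true = np.zeros(N)
a_true[rng.choice(N, size=k, replace=False)] = rng.normal(size=k)
G = rng.normal(size=(M, N))
X = G @ a_true

a_hat = l1_reconstruct(G, X)
print("max reconstruction error:", np.max(np.abs(a_hat - a_true)))
```

In the network setting, each node gives one such problem: the sparse vector a collects the coefficients of that node's dynamics and couplings, and the nonzero coupling terms identify its neighbors.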

To give a concrete example, we consider a real-world network, the Santa Fe Institute (SFI) collaboration network [25]. There are N = 76 nodes in the largest connected component of the network and the average degree is about 3. A schematic illustration of the network is shown in Fig. 5. A spectral analysis reveals that the eigenvectors associated with $\lambda_{76}$, $\lambda_{75}$ and $\lambda_{74}$ characterize the three hubs: 40, 7 and 67, all marked in red. The eigenvector associated with $\lambda_{73}$ involves a group of nodes numbered between 17 and 25 (marked in green). For $\lambda_{72}$, the corresponding eigenvector covers nodes 26 to 29, and node 34 (marked in cyan). The three clusters, nodes 41 to 47 (blue), 1 to 6 (magenta), and 48 to 53 (violet), are represented by the eigenvectors associated with $\lambda_{70}$, $\lambda_{69}$, and $\lambda_{68}$, respectively. In fact, clusters of larger scales can be identified for smaller eigenvalues.

Fig. 5. (Color online.) Schematic illustration of the largest component of the SFI collaboration network and the clustered structure revealed by an eigenvalue/eigenvector analysis.

Now assume that the network topology is unknown but an oscillatory time series from each node is available. To simulate the situation, we assume that the dynamics of each node is described by the chaotic Rössler oscillator [26]. Applying the compressive-sensing based method to uncover the network topology, we can then perform a spectral analysis to estimate the ages of various nodes in the network. Figure 6 shows the eigenvalues of the predicted and the actual Laplacian matrix. We observe an excellent agreement.

Fig. 6. (Color online.) Sorted eigenvalues of the predicted and actual Laplacian matrix of the SFI collaboration network. The number of data points used in uncovering the network structure is about 40% of the number of total unknown coefficients in the power-series expansion.
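
The kind of comparison shown in Fig. 6 can be sketched on toy data as follows. This is an illustration under loud assumptions, not the procedure of Ref. [2]: a small random graph stands in for the real network, the "reconstructed" coupling weights are mimicked by adding bounded noise to the true adjacency matrix, and links are declared by thresholding the estimated weights at 0.5. The helper name `laplacian_spectrum` is illustrative.

```python
import numpy as np

def laplacian_spectrum(A):
    """Sorted eigenvalues of the Laplacian built from adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.eigvalsh(L)

rng = np.random.default_rng(2)
n = 20

# Toy "actual" network: a sparse random undirected graph (assumption)
upper = np.triu(rng.random((n, n)) < 0.15, k=1)
A_true = (upper | upper.T).astype(float)

# Stand-in for reconstructed coupling weights: true links plus bounded noise
noise = rng.uniform(-0.2, 0.2, size=(n, n))
W_est = A_true + (noise + noise.T) / 2

# Assumed decision rule: a link exists if its estimated weight exceeds 0.5
A_pred = (W_est > 0.5).astype(float)
np.fill_diagonal(A_pred, 0.0)

spec_true = laplacian_spectrum(A_true)
spec_pred = laplacian_spectrum(A_pred)
max_dev = float(np.max(np.abs(spec_true - spec_pred)))
print("largest deviation between sorted spectra:", max_dev)
```

When the reconstruction errors stay below the threshold margin, the predicted adjacency matrix coincides with the actual one and the two sorted spectra agree; larger errors would show up as deviations between the two curves.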

6 Conclusions 

In summary, we have developed a procedure to estimate the evolutionary ages of nodes in complex networks. The basic observation is that eigenvectors associated with different eigenvalues of the Laplacian matrix can typically represent highly localized groups of nodes in the network. A qualitative argument can then be made for the existence of positive correlation between the node ages and the magnitudes of the eigenvalues. This means that, when the network topology is known, a simple eigenvalue analysis can lead to reliable information about the age distribution of nodes in the network. For situations where the network topology is unknown but time series from nodes are available, it is necessary to uncover the topology in order to estimate the node ages, and we have demonstrated that this can be done efficiently using compressive sensing. Examples from model and real-world networks, including a PPI network, are used to validate our approach. We hope that our method will find applications in fields such as systems biology and the propagation of a rumor, a fashion, a joke, or a flu, where estimating node ages can be of significant value.

The network-reconstruction technique used in our work is based on compressive sensing, which works for situations where the types of mathematical forms of the nodal dynamical systems and coupling functions are known (although details of these functions are not required) and can be represented by series expansion. So far the method has not been applied to gene-regulatory networks due to the difficulty of finding suitable series expansions. The recent method by Hempel et al. [27] is based on extracting statistical information and has been demonstrated to work well for gene-regulatory networks.

While many real-world systems such as gene regulatory and supply chain networks are directed, our present work focused on undirected networks. The main consideration is that many networks generated by some kind of evolutionary processes or constructed through experiments tend to be undirected. For example, the Baker Yeast obtained through the approach [...] sizes can be uncovered by eigenvectors of smaller eigenvalues. Different eigenmodes can be used to detect clusters of varying scales, providing a correlation with the evolutionary ages in situations where hubs or clusters of hubs are formed by history. The principle on which our method is based thus does not take into account directionality in the node-to-node interactions. Developing a method to uncover the evolutionary ages for directed complex networks remains an interesting open question at present.

Acknowledgement